Hash-Binary Search: A Fast Technique for Searching an English Spelling Dictionary

Authors

  • Douglas Comer
  • Vincent Y. Shen
Abstract

When a document is prepared using a computer system, it can be checked for spelling errors automatically and efficiently. This paper presents the hash-binary method for searching a static table and applies it to searching an English spelling dictionary. Analysis shows that with only a small amount of space beyond that required to store the keys, the hash-binary search method performs better than either hashing with open-addressing or binary search. Experiments with a sample dictionary verify the results. We also present extensions to account for skewed frequencies of access as well as methods for testing alternative hashing functions.
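The abstract does not spell out the method, so the following is only a minimal sketch of one plausible reading of "hash-binary search" over a static table: hash the key to pick a bucket, then binary-search the sorted keys of that bucket. The hash function `bucket_of`, the `starts` index array, and the names `build_table` and `contains` are assumptions made for illustration, not the authors' implementation.

```python
from bisect import bisect_left


def bucket_of(word, num_buckets):
    """Hypothetical hash function (simple polynomial hash); an assumption."""
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) % num_buckets
    return h


def build_table(words, num_buckets):
    """Build a static table: keys grouped by bucket, sorted within each bucket.

    Returns (keys, starts), where bucket b occupies keys[starts[b]:starts[b + 1]].
    The starts array is the small extra space beyond the keys themselves.
    """
    keys = sorted(set(words), key=lambda w: (bucket_of(w, num_buckets), w))
    starts = [0] * (num_buckets + 1)
    for w in keys:
        starts[bucket_of(w, num_buckets) + 1] += 1
    # Prefix sums turn per-bucket counts into bucket start offsets.
    for b in range(num_buckets):
        starts[b + 1] += starts[b]
    return keys, starts


def contains(keys, starts, word):
    """Hash to a bucket, then binary-search only that bucket's slice."""
    b = bucket_of(word, len(starts) - 1)
    lo, hi = starts[b], starts[b + 1]
    i = bisect_left(keys, word, lo, hi)
    return i < hi and keys[i] == word
```

Under these assumptions, a lookup costs one hash computation plus a binary search over a single bucket rather than over the whole dictionary, and no space is spent on empty slots as it would be with open addressing.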

Similar resources

Construction of FuzzyFind Dictionary using Golay Coding Transformation for Searching Applications

Searching through a large volume of data is critical for companies, scientists, and search engine applications because of its time and memory complexity. In this paper, a new technique of generating FuzzyFind Dictionary for text mining was introduced. We simply mapped the 23 bits of the English alphabet into a FuzzyFind Dictionary or more than 23 bits by using more FuzzyFind Diction...

Weighted Unsupervised Learning for 3D Object Detection

Fast Similarity Search in Large Dictionaries

Fast similarity search is important for time-sensitive applications. Those include both enterprise and web scenarios, where typos, misspellings, and noise need to be removed in an efficient way, in order to improve data quality, or to find all information of interest to the user. This paper presents a new algorithm called Fast Similarity Search (FastSS) that performs an exhaustive similarity se...

English dictionary searching with little extra space

When text is typeset using a computer-based system, it can also be checked for spelling errors automatically and efficiently. Several methods of spelling error detection have been proposed. Morris et al. [MORR75] study statistical properties of English words, and describe an algorithm to catch possible typos by examining the relative frequency of trigrams (3-letter combinations). Kernighan et a...

Fast Phonetic Similarity Search over Large Repositories

Today there is a large amount of unstructured data produced by information systems from different domains. These sources may be analyzed for different purposes. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they have two main drawbacks. First, they are not rich enough to encode phonetic information to assist the...

Publication date: 2011